Hierarchical Phrase-Based MT for Phonetic Representation-Based Speech Translation
Abstract
This paper presents a novel technique for speech translation using hierarchical phrase-based statistical machine translation (HPB-SMT). The system translates speech from phone sequences, as opposed to the conventional approach of translating from word sequences. Giving the machine translation (MT) system access to phonetic information lets it act as both a word recognition component and a translation component. This yields better performance than conventional speech translation approaches, because recognition errors can be recovered with the help of a source language model, a translation model, and a target language model. For this purpose, the MT translation models are adapted to work on source-language phones using a grapheme-to-phoneme component. Source-side phonetic confusions are handled using a confusion network. Results on the IWSLT'10 English-Chinese translation task show a significant improvement in translation quality. The HPB-SMT results are compared with previously published results for a phrase-based statistical machine translation (PB-SMT) baseline system, and the HPB-SMT system outperforms the PB-SMT baseline.
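As a rough illustration of the two source-side ingredients named in the abstract, the sketch below maps words to phones with a toy grapheme-to-phoneme lexicon and represents phonetic confusions as a confusion network (a sequence of slots, each holding alternative phones with posterior scores). The lexicon entries, phone symbols, scores, and function names are all hypothetical and chosen for illustration; this is not the authors' implementation. Taking the best arc per slot is only a stand-in for what an MT decoder would do with the full network.

```python
from typing import Dict, List, Tuple

# Hypothetical toy pronunciation lexicon (ARPAbet-like symbols).
# A real grapheme-to-phoneme component would be trained or dictionary-backed.
TOY_LEXICON: Dict[str, List[str]] = {
    "speech": ["S", "P", "IY", "CH"],
    "to": ["T", "UW"],
    "text": ["T", "EH", "K", "S", "T"],
}


def words_to_phones(sentence: str, lexicon: Dict[str, List[str]]) -> List[str]:
    """Convert a word sequence into a flat phone sequence via lexicon lookup."""
    phones: List[str] = []
    for word in sentence.lower().split():
        if word not in lexicon:
            raise KeyError(f"no pronunciation for {word!r} in the toy lexicon")
        phones.extend(lexicon[word])
    return phones


# One slot of a confusion network: alternative phones with posterior scores.
Slot = List[Tuple[str, float]]


def best_path(network: List[Slot]) -> List[str]:
    """Pick the highest-scoring phone in every slot (1-best path)."""
    return [max(slot, key=lambda arc: arc[1])[0] for slot in network]


def count_paths(network: List[Slot]) -> int:
    """Number of distinct phone sequences the network encodes."""
    total = 1
    for slot in network:
        total *= len(slot)
    return total


# Assumed example network for the first three phones of "speech".
example_network: List[Slot] = [
    [("S", 0.9), ("Z", 0.1)],
    [("P", 0.6), ("B", 0.4)],
    [("IY", 0.7), ("IH", 0.3)],
]

phone_sequence = words_to_phones("speech to text", TOY_LEXICON)
one_best = best_path(example_network)
n_paths = count_paths(example_network)
```

The point of keeping the whole network rather than only `one_best` is that the translation and language models can then prefer a lower-scoring phone alternative when it leads to a better translation, which is the error-recovery effect the abstract describes.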
Similar resources
Phonetic Representation-Based Speech Translation
This paper explores a tight coupling of Automatic Speech Recognition (ASR) and Machine Translation (MT) for speech translation with information sharing on the phone level. Our novel approach allows MT to access fine-grained phonetic information from ASR, as a methodology for facilitating speech translation. Specifically, Phrase-based Statistical MT (PBSMT) models are adapted to work on source la...
The RWTH Aachen speech recognition and machine translation system for IWSLT 2012
In this paper, the automatic speech recognition (ASR) and statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2012 are presented. We participated in the ASR (English), MT (English-French, Arabic-English, Chinese-English, German-English) and SLT (English-French) tracks. F...
Phrase Based Language Model For Statistical Machine Translation
We consider phrase-based language models (LMs), which generalize the commonly used word-level models. A similar concept of phrase-based LMs appears in speech recognition, but it is rather specialized and thus less suitable for machine translation (MT). In contrast to the dependency LM, we first introduce the exhaustive phrase-based LMs tailored for MT use. Preliminary experimental results show that...